Examining Associations between Views of a Data Set

Author

  • Werner Stuetzle
Abstract

Connecting views is one of the techniques for the exploration of multivariate data. In this paper I review painting as a technique for connecting views. I then describe the odds plot, a new graphical display to be used in conjunction with painting. It is designed to emphasize associations between views. Permutation tests offer a way of assessing whether an observed association should be taken seriously. Examples illustrate the use of painting and odds plots.

1 Introduction

Detecting structure in multivariate data is one of the fundamental problems in data analysis. Structure can have a variety of meanings: clustering, associations between variables, presence of outliers, etc. Precisely what aspects of a data set we are interested in will depend on the underlying scientific problem. In any case, looking at a printout of hundreds or thousands of numbers is not a viable way of gaining insight. The data have to be either numerically summarized or graphically presented.

Any numerical summary conveys an accurate impression of the data distribution only under certain assumptions; if these assumptions are violated, the summary can be misleading. Mean vector and covariance matrix, for example, do not provide an accurate picture of a data set consisting of two spatially separated, spherical clusters. If we are in an exploratory situation, graphical presentation clearly is the better approach.

Unfortunately, it is not easy to visually present a multivariate data set. This difficulty is reflected in the large number of displays that have been invented (see Gnanadesikan (1977) for examples). An alternative is to give up on the attempt to squeeze all the information into a single view, and instead display multiple views of the data. Multiple views can be displayed sequentially (this is a way of thinking about rotating three-dimensional scatterplots) or simultaneously. In this paper I will focus on the simultaneous display of views.
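The remark about mean vector and covariance matrix can be checked numerically. The following is a minimal sketch, not taken from the paper: the cluster centres, spread, sample sizes, and random seed are all illustrative assumptions. It builds two well-separated spherical clusters and shows that the overall mean falls in the empty region between them, while the covariance matrix describes a single elongated cloud.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; an arbitrary choice for reproducibility

# Two spherical clusters in the plane (hypothetical centres and spread).
cluster_a = rng.normal(loc=(-5.0, 0.0), scale=0.5, size=(200, 2))
cluster_b = rng.normal(loc=(5.0, 0.0), scale=0.5, size=(200, 2))
data = np.vstack([cluster_a, cluster_b])

# Numerical summaries of the pooled data.
mean = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

# Distance from the overall mean to the closest actual observation.
nearest = np.linalg.norm(data - mean, axis=1).min()

print("mean vector:", mean)
print("covariance matrix:\n", cov)
print("distance from mean to nearest observation:", nearest)
```

The mean sits near the origin, far from every observation, and the large variance along the first axis reflects the gap between the clusters rather than the spread within either one. A scatterplot of the same data would reveal the two groups immediately.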
Suppose we have a data set with four variables, X1, ..., X4. We could display the data in two bivariate scatterplots, for example a plot of X2 against X1 (Plot-1) and a plot of X4 versus X3 (Plot-2). Every observation is now represented by two icons, one in Plot-1 and one in Plot-2. The information contained in those two plots, however, is not in itself sufficient to reconstruct the configuration of the data in four-dimensional space. For reconstruction we need to be able to connect the two views, i.e., identify pairs of icons (one in each plot) corresponding to the same observation. …
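One way to picture the connection between the two views is to let the row index of the data matrix serve as the link between icons. The sketch below is an assumption about how such linking could be done in code, not a description of the paper's interactive system: the `brush` helper, the toy data, and the rectangle bounds are all hypothetical. Observations selected by a rectangular brush in Plot-1 are looked up by row in Plot-2.

```python
import numpy as np

def brush(x, y, x_range, y_range):
    """Return a boolean mask of the observations inside a rectangular brush."""
    x = np.asarray(x)
    y = np.asarray(y)
    inside_x = (x >= x_range[0]) & (x <= x_range[1])
    inside_y = (y >= y_range[0]) & (y <= y_range[1])
    return inside_x & inside_y

# Toy data: one row per observation, columns are X1, X2, X3, X4.
data = np.array([
    [0.1, 0.2, 5.0, 1.0],
    [0.9, 0.8, 4.0, 2.0],
    [0.2, 0.1, 3.0, 3.0],
    [0.8, 0.9, 2.0, 4.0],
])

# Plot-1 shows X2 against X1: brush the lower-left corner of that plot.
painted = brush(data[:, 0], data[:, 1], x_range=(0.0, 0.5), y_range=(0.0, 0.5))

# Plot-2 shows X4 versus X3: the same rows identify the icons to highlight there.
plot2_highlight = data[painted][:, 2:4]

print("painted observations:", np.flatnonzero(painted))
print("their (X3, X4) coordinates in Plot-2:\n", plot2_highlight)
```

Because both plots are drawn from the same rows, a single boolean mask is enough to carry a selection from one view to the other.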

Publication date: 1988